Selection of keyword phrases for providing contextually relevant content to users

ABSTRACT

A process is described for assessing the suitability of particular keyword phrases for use in serving contextually relevant content for display on pages of network-accessible sites. In one embodiment, the process involves scoring the key phrases based in part on collected user behavioral data, such as view counts of associated social media content items. A process is also disclosed in which selected keyword phrases on a page are transformed into links that can be selected by a user to view bundled content that is related to such keyword phrases.

RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 15/219,010, filed Jul. 25, 2016, now pending, which is a continuation of U.S. application Ser. No. 14/546,744, filed Nov. 18, 2014, now U.S. Pat. No. 9,418,374, which is a continuation of U.S. application Ser. No. 13/478,002, filed May 22, 2012, now U.S. Pat. No. 8,909,639, which is a continuation of U.S. application Ser. No. 13/174,296, filed Jun. 30, 2011, now U.S. Pat. No. 8,209,333, which is a continuation of U.S. application Ser. No. 12/016,887, filed Jan. 18, 2008, now U.S. Pat. No. 8,073,850, which claims the benefit of U.S. Provisional Appl. No. 60/885,853, filed Jan. 19, 2007. Each of the disclosures of the aforesaid applications is hereby incorporated by reference in its entirety.

FIELD OF THE INVENTION

The present invention relates to computer-implemented processes for identifying the terms and/or phrases most suitable for serving contextually relevant content. The invention also relates to processes for serving contextually relevant content for display within web pages or other types of documents.

BACKGROUND

A variety of systems exist for selecting content, such as advertisements, to present on web pages based on the content of such web pages. These systems often fail to select content that is relevant to, or suitable for display on, the particular page at issue. For example, an ad for a particular product or company may be selected to display on a web page containing an article that is critical of that product or company. As another example, an ad for a particular product may be displayed on a page containing an article about a completely unrelated topic merely because the product is briefly mentioned in the article. Existing systems also frequently display the selected ad in a manner that is distracting to users. These and other issues contribute to a low industry-wide click-through rate of less than 1%.

SUMMARY OF THE DISCLOSURE

A process is described for assessing the suitability of particular keyword phrases for use in serving contextually relevant content for display on pages of network-accessible sites. In one embodiment, the process involves scoring the key phrases based in part on collected user behavioral data, such as view counts of associated social media content items. A process is also disclosed in which selected keyword phrases on a page are transformed into links that can be selected by a user to view bundled content that is related to such keyword phrases.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a system for identifying and serving contextually relevant content according to one embodiment.

FIGS. 2A-2C illustrate examples of panel formats that may be used by the system of FIG. 1 to serve bundled content.

FIG. 3 illustrates how a panel may be presented on a web page in response to user selection of a keyword phrase that has been transformed into a user-selectable link.

FIG. 4A illustrates one embodiment of a process that may be implemented by the indexing engine of FIG. 1 to identify key phrases that are relevant to a particular URL or page.

FIG. 4B depicts a peer group of web pages associated with a target page.

FIG. 4C illustrates a user-activity-based method for selecting key phrases to use for serving contextually-relevant content.

FIG. 5 illustrates one embodiment of a process performed by the system of FIG. 1 when an advertiser creates an ad campaign.

FIG. 6 illustrates one example of a form page that may be used to create the ad campaign.

FIG. 7 illustrates how selected key phrases may be presented to the advertiser via a campaign cloud interface in the process of FIG. 5.

FIG. 8 illustrates a campaign summary page.

FIG. 9 illustrates a campaign purchase form.

FIG. 10 summarizes the process depicted by FIGS. 5-9.

FIG. 11 illustrates one embodiment of a process that occurs when an end user loads a panel-enabled web page of a publisher site 34.

DETAILED DESCRIPTION OF THE SPECIFIC EMBODIMENTS

A system that embodies various inventive features will now be described with reference to the drawings. Nothing in this description is intended to imply that any particular feature, characteristic, or component of the system or its use is essential to the invention.

I. Overview

FIG. 1 illustrates the basic components of a system 30, referred to as the Intextual system, for acquiring and serving contextually relevant content for display on web pages. The figure also shows the various entities that interact with the Intextual system. These entities include the following: advertisers that operate advertiser web sites 32; publishers that operate publisher web sites 34 (i.e., sites that publish or syndicate content served by the Intextual system 30); end users that access the publisher sites 34 via browser software running on end user computing devices (one computer/browser 36 shown); and social content providers that upload social content (e.g., photos, videos, music, textual content, etc.) to one or more social media sites 38.

As illustrated, the Intextual system 30 (hereinafter “the system”) includes a bundle server 40 that serves bundled content for display within, or in conjunction with, web pages 42 of the publisher sites 34. The bundled content is preferably displayed in association with specific key phrases that appear in pages of the publisher sites 34. (As used herein, a “key phrase” can be either a single term or a sequence of two or more terms.) As described below, the system 30 uses a novel process to select the key phrases that are likely to be the most effective for particular ads, advertisers and publishers. This process involves analyzing social media content obtained from one or more social media sites 38 to assess the popularity levels of particular key phrases.

As depicted in FIG. 1, the bundled content is preferably displayed in a panel 44 that appears when the user clicks on, or in some embodiments hovers a mouse cursor over, the corresponding key phrase (which is displayed as a special link) in a web page 42 of a publisher site 34. The panel 44 preferably occupies a portion of the browser's main viewing area, such that a portion of the web page (including the key phrase) is still visible. As shown in FIGS. 2A, 2B and 3, the panel in one embodiment includes (1) an area that displays thumbnail images of videos related to the key phrase; (2) an area that displays one or more clickable advertisements related to the key phrase; and (3) an area that displays thumbnail images of one or more photos related to the key phrase. FIG. 2C illustrates the panel's configuration when it is docked at the bottom of the browser's main viewing area.

FIG. 3 illustrates how the panel 44 may be presented on a web page 42 in response to user selection of a corresponding key phrase (“Atari 2600”) that has been transformed into a link. The selected key phrase is displayed at the top of the panel. The panel 44 also displays photos and video images associated with the key phrase. Because no advertisements are associated with the selected keyword phrase in this example, none are shown. If the user selects another highlighted key phrase on the page 42, the embedded JavaScript updates the panel 44 with content corresponding to that key phrase. As described below, the rate at which users click on a particular highlighted key phrase may be monitored by the system 30, and used as one factor to assess whether this key phrase should continue to be transformed into a link. This assessment may be performed at the publisher page level, the publisher site level, and/or a global level.

The videos and photos are preferably obtained from social media sites 38 such as YouTube™, Flickr™ and Myspace™, and may be obtained via interfaces to these sites or by using a crawling/scraping process. If the user clicks on a thumbnail of a video, the panel 44 expands on the screen (expanded views not shown), loads the corresponding web page from the associated social media site, and begins to play the video as displayed on that page. If the user clicks on a thumbnail image of a photo, the panel 44 expands and loads the corresponding photo page of the associated social media web site. If the user clicks on an ad, the panel expands and loads the corresponding landing page of the advertiser's web site. (As discussed below, the system tracks, and charges the corresponding advertisers for, such ad selection events.) In each of these scenarios, the user can click on additional links within the panel 44 to navigate to other content (e.g., to other videos or photos of a social media site, or to other content of the advertiser site). If the user clicks on one of the links at the bottom of the panel.

Thus, the panel 44 acts as a portal or mini browser that enables the user to access social and sponsored content without navigating away from the publisher's web page 42. Because the key phrases are selected so as to correspond to current popular topics, the sponsored and social content displayed within the panel 44 tends to be highly useful and relevant to users.

As depicted in FIG. 1, a publisher can enable the display of the panel 44 on a web page 42 by adding a tag to the web page's HTML coding. A web page that includes such a tag is referred to herein as a “panel-enabled” page 42. The tag may be a JavaScript line or sequence that causes the user's web browser 36 to request a JavaScript component from the bundle server 40 (or from another server of the system). When the browser 36 loads and executes this JavaScript component, the key phrase or phrases appearing on the web page 42 are transformed into special links that can be clicked on to cause the panel 44 to be displayed with contextually relevant content. The JavaScript component is also responsible for creating the display of the panel 44, and for populating the panel with the bundled content associated with the key phrase. In other embodiments, ActiveX™, Flash, or another type of scripting language or control may be used in place of JavaScript.

As shown in FIG. 1, the system 30 may host a web site 50 that provides functionality for advertisers to create and manage ad campaigns. One example of a process by which the advertisers create ad campaigns is described below. Advertisers may additionally or alternatively create ad campaigns via the corresponding publisher sites 34 through an interface to the system 30. The system's web site 50 may also provide functionality for publishers to register with the system to publish content served by the system.

As illustrated in FIG. 1, the system includes an indexing engine 52 that is responsible for analyzing various sources of information for purposes of selecting appropriate key phrases. The information sources preferably include the following: (1) web pages of the publisher sites 34, (2) web pages of the advertiser sites 32, (3) pages 54 (referred to as “peer pages”) of sites that have direct links to, or are the target of direct links from, the publisher and advertiser sites, and (4) social media sites 38.

The indexing engine 52 is preferably invoked in two primary scenarios. The first is when an advertiser creates an ad for display on a particular publisher web site 34 or a group of publisher sites. In this scenario, the indexing engine 52 is used to select and rank key phrases to suggest to the advertiser. The second scenario is when an end user/browser 36 loads a panel-enabled page 42 of a publisher site 34. In this scenario, if the page 42 has not recently been analyzed by the indexing engine 52, the indexing engine retrieves and analyzes the web page 42 to identify key phrases to transform into links. Both of these scenarios are described below.

The various components of the system 30 may be implemented as executable code (software) executed by one or more general purpose computers. The components may reside at a common location, or may be distributed geographically. The software code and associated data may be stored in any type or types of computer data repository (e.g., relational databases, flat files, hard disk storage, RAM, etc.). The browsers 36 may be any conventional web browser capable of executing JavaScript. The various communications depicted in FIG. 1 and other drawings occur over the Internet and/or other computer networks.

II. Operation of Indexing Engine

FIG. 4A illustrates one embodiment of a process that may be implemented by the indexing engine 52 to identify key phrases that are relevant to a particular URL or page. This process is referred to herein as “Peer-based Language Processing,” or PLP, as it involves the processing of a peer group of web sites or web pages to identify relevant key phrases. As will be recognized, the process shown in FIG. 4A can be varied significantly without departing from the scope of the invention.

In block 1, the indexing engine 52 receives a URL for which to identify relevant key phrases. If the indexing process is triggered by advertiser generation of an ad, this URL is typically the advertiser-specified landing page URL of the ad. If the process is triggered by a user loading a panel-enabled web page 42 of a publisher site 34, the URL is the URL of this panel-enabled web page. For ease of description, the URL received in block 1 will be referred to as the “target URL,” and the corresponding web page will be referred to as the “target page.”

In blocks 2-4, the indexing engine 52 retrieves and parses the target page (block 2) to extract a set of key phrases (block 3) and a set of outbound link URLs (block 4). The key phrases are extracted in one embodiment by stripping out HTML tags, and by using one or more language files to remove stop words. As depicted by block 5, the indexing engine 52 also accesses an external service/database, such as a web service provided by Alexa Internet, to identify inbound link URLs (i.e., URLs of web pages that point to the target page and are not part of the target page's Internet domain). The inbound and outbound URLs form a “peer group” for the target URL. As mentioned below, in scenarios in which the advertiser is creating an ad campaign, the peer group may also include some or all of the web pages of the advertiser's target site 32 (i.e., the site that includes the landing page for the ad campaign).
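Blocks 2-4 can be illustrated with the following minimal Python sketch. It is not the disclosed implementation: the function name parse_target_page, the small STOP_WORDS set (standing in for the "language files"), and the choice to treat runs of up to two consecutive non-stop words as candidate key phrases are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical stop-word list; the description refers only to "language files".
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}


class _PageParser(HTMLParser):
    """Collects visible text and the href values of anchor tags."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.hrefs = []
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_parts.append(data)


def parse_target_page(html, target_url, max_words=2):
    """Return (candidate key phrases, outbound link URLs) for one page."""
    parser = _PageParser()
    parser.feed(html)
    parser.close()

    # Block 3: strip tags, lowercase, and drop any n-gram containing a stop word.
    tokens = re.findall(r"[a-z0-9]+", " ".join(parser.text_parts).lower())
    phrases = set()
    for i in range(len(tokens)):
        for n in range(1, max_words + 1):
            gram = tokens[i:i + n]
            if len(gram) == n and not any(w in STOP_WORDS for w in gram):
                phrases.add(" ".join(gram))

    # Block 4: resolve each href against the target URL, keeping http(s) links.
    links = []
    for href in parser.hrefs:
        url = urljoin(target_url, href)
        if urlparse(url).scheme in ("http", "https") and url not in links:
            links.append(url)
    return phrases, links
```

Inbound links (block 5) would come from an external service and are therefore not modeled here; in this sketch the peer group would be the union of the returned outbound links and whatever inbound URLs that service supplies.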

As depicted by block 6, the indexing engine 52 retrieves and scans the content of the identified inbound and outbound URLs (and possibly additional pages of the peer group) to determine phrase frequency of the extracted phrases within the peer group. Key phrases having relatively high frequencies of occurrence across the peer group tend to best characterize the target page, and thus tend to be more useful than less-frequently-occurring key phrases for serving context-relevant content.
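As a rough illustration of block 6, the sketch below counts how often each extracted phrase occurs across the text of the peer pages. The function name peer_frequency and the whole-word matching rule are assumptions; the description does not specify how occurrences are counted.

```python
import re
from collections import Counter


def peer_frequency(phrases, peer_texts):
    """Count whole-word occurrences of each phrase across all peer page texts."""
    counts = Counter()
    for text in peer_texts:
        # Normalize to lowercase tokens separated by single spaces.
        normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
        for phrase in phrases:
            # Lookarounds ensure the phrase is not part of a longer token.
            pattern = rf"(?<!\S){re.escape(phrase)}(?!\S)"
            counts[phrase] += len(re.findall(pattern, normalized))
    return counts
```

Phrases that never occur in the peer group end up with a count of zero rather than being dropped, so later scoring steps can still see them.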

FIG. 4B depicts one example of a peer group that may be formed for a particular target page (T). In this example, the peer group includes three types of additional pages: (1) pages that include a link to the target page, (2) pages to which the target page includes a link, and (3) other pages of the site of the target page. Typically, the peer group will include web pages of many different web sites.

Various other methods may be used to form the peer group. For example, web usage trails of users may be analyzed in aggregate to identify other web pages that are behaviorally related to (e.g., commonly accessed during the same browsing session as) the target web page; these behaviorally related pages may then be used as, or used to supplement, the peer group.

As depicted by block 7 of FIG. 4A, the indexing engine 52 also determines the popularity levels of the extracted key phrases. This may be accomplished by, for example, determining the frequencies with which the extracted key phrases appear within a repository of social media content, and/or determining how often media related to these key phrases is viewed. These determinations may be made by, for example, scanning an index of one or more social media web sites 38. Scraping methods can also be used to extract view counts for particular videos, pictures, and other content items on the social media sites 38. Key phrases that appear relatively frequently in the social media and/or relate to content that is frequently viewed typically correspond to relatively popular subjects and topics, and thus tend to be more useful than the less-frequently-occurring key phrases. The social media sites 38 may be separate and distinct from the advertiser sites 32 and publisher sites 34.

The popularity levels of the key phrases may additionally or alternatively be assessed using other sources of information. For example, the key phrase popularity levels can be assessed by analyzing search query logs of an Internet search engine, a social media site, and/or a news site. As another example, a service such as Google™ Trends may be used to assess key phrase popularity trends, so that key phrases rapidly gaining in popularity can be given more weight.

In addition, some social media sites 38 include publicly accessible APIs that provide access to usage statistics regarding the frequencies with which particular key phrases are used to tag, or are used to search for, particular photos, videos or other content items on a social media site. YouTube.com is one example of a social media site that provides such an API. These APIs may be used as an additional or alternative source of information for assessing the key phrase popularity levels in block 7.

For example, each key phrase may be scored or assessed based (or based in part) on its acceleration (i.e., the rate at which it is gaining in popularity) in social media sites, or based on another popularity metric, as determined via a mathematical scoring model that examines user activity over time. One example of such a model is depicted in FIG. 4C. This figure depicts a graph representing the popularity level of a particular key phrase over time. The vertical axis in this example represents the current popularity level of the key phrase as measured by one, or any combination of, the following metrics: (1) the view count associated with one or more corresponding media items (videos, photos, etc.) on a social media site or group of sites 38, (2) the number of times the key phrase has been submitted as a search query to one or more Internet search engines, and (3) the number of times the key phrase, as transformed into links on publisher sites 34 as described above, has been selected. Each of these metrics (1)-(3) may be based on a most recent window or dataset of user activity data, such as the last 3 days of user activity. The dashed box in FIG. 4C represents the period of time during which the key phrase should generally be transformed into links on the publisher sites 34. Specifically, as the key phrase is rapidly gaining in popularity, the system 30 may score the key phrase relatively highly to increase the probability that it will be selected for use in targeting content. As the acceleration begins to decline, the system 30 may assign a lower popularity score that decreases the key phrase's probability of use. Rather than using a phrase scoring method, hard cutoffs could be used to enable and disable the key phrase's use.
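One simple way to approximate the acceleration-based scoring described above is to compare the most recent activity window with the window immediately before it. The sketch below is an assumption for illustration only: the function names, the relative-growth formula, and the default 3-day window are not taken from the disclosed scoring model, which is specified only at the level of FIG. 4C.

```python
def acceleration_score(daily_counts, window=3):
    """Relative growth of the latest window over the window before it.

    daily_counts is a chronological list of per-day popularity counts
    (e.g., view counts, query counts, or link selections).
    """
    if len(daily_counts) < 2 * window:
        raise ValueError("need at least two full windows of data")
    recent = sum(daily_counts[-window:])
    previous = sum(daily_counts[-2 * window:-window])
    # max(..., 1) avoids division by zero for phrases with no prior activity.
    return (recent - previous) / max(previous, 1)


def should_link(daily_counts, window=3, cutoff=0.0):
    """Hard-cutoff variant: enable the phrase only while it is gaining."""
    return acceleration_score(daily_counts, window) > cutoff
```

A positive score indicates a phrase that is gaining in popularity, and a negative score indicates one that is declining. The should_link helper corresponds to the hard-cutoff alternative mentioned in the text, with the cutoff value chosen arbitrarily here.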

In block 8, each extracted key phrase is scored by combining its peer frequency with its social media frequency/popularity. An appropriate weighting method may optionally be used to give more weight to one type of content (peer versus social) than the other. Various other criteria may also be incorporated into the scores. For example, key phrases that appear relatively infrequently across the entire web (or some other reference document collection) may be scored more highly on the basis that they better distinguish the peer group from the web or reference document collection as a whole. As another example, relatively popular web pages in the peer group, and/or relatively popular social content items, may be weighted more heavily in measuring key phrase frequency.
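The combination step of block 8 might be sketched as a weighted sum of the two normalized signals. The function name combine_scores, the max-normalization, and the peer_weight parameter are assumptions; the description leaves the weighting method open.

```python
def combine_scores(peer_freq, social_pop, peer_weight=0.5):
    """Blend peer-group frequency and social popularity into one score per phrase.

    Both inputs map phrase -> non-negative number. Each signal is first
    divided by its own maximum so that the two are on a comparable 0-1 scale.
    """
    def normalize(values):
        top = max(values.values(), default=0)
        return {k: (v / top if top else 0.0) for k, v in values.items()}

    peer_n = normalize(peer_freq)
    social_n = normalize(social_pop)
    return {
        phrase: peer_weight * peer_n[phrase]
        + (1 - peer_weight) * social_n.get(phrase, 0.0)
        for phrase in peer_freq
    }
```

Setting peer_weight above 0.5 favors phrases that characterize the peer group, while a lower value favors phrases that are popular in social media. The rarity and page-popularity adjustments mentioned in the text are omitted from this sketch.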

The output of the indexing engine 52 is a set of name/value pairs representing the extracted key phrases and their respective scores. These name/value pairs are stored in a database 60 of the system in association with the target URL. The most highly scored key phrases tend to be relatively popular key phrases that characterize the target site or page, and which are the most useful for selecting contextually relevant content.

If the target URL is a landing page of an advertisement, the extracted key phrases with the highest scores are suggested to the advertiser as part of a “campaign cloud” (see FIG. 7, discussed below). In this scenario, the scores are preferably used to set corresponding CPC (cost per click) and CPM (cost per thousand) rates for charging the advertiser for ad click-through events. If, on the other hand, the target page is a panel-enabled page 42 of a publisher site 34, the extracted phrases and scores are used by the system 30 to select specific key phrases to display on the target page as special links.

As mentioned above, in ad generation scenarios in which the advertiser designates a particular publisher site 34, the process shown in FIG. 4A may be varied to include the advertiser's target site in the peer group of the publisher. This increases the likelihood that the extracted phrases suggested to the advertiser are phrases that actually appear (and ideally appear reasonably frequently) on the advertiser site 32.

As will be recognized, the PLP process described above can be used in a wide range of applications, including applications that do not involve the display of bundled content, and including applications in which the key phrases on the publisher page are not converted into special links. The present invention encompasses such applications. As one example, the PLP process can be used in Google™ AdSense™ type applications in which ads are displayed in a designated area of the publisher web page.

III. Advertiser Use (FIGS. 5-10)

The process by which an advertiser interacts with the system to create an ad campaign will now be described with reference to FIGS. 5-10. As represented by event A in FIG. 5, the advertiser initially uses an advertisement form page 62 of the system 30 to create an ad. An example of such a form page is shown in FIG. 6. The ad preferably includes an ad title, ad text, and the URL of a landing page. In this example, the ad form is branded with the logo of a corresponding publisher, and the ad is being created specifically for display by this publisher.

In event B, the indexing engine is invoked to generate a set of key phrases and associated scores for the designated landing page. This may be accomplished using the process shown in FIG. 4A, discussed above. The results are stored in a database 60 (event C) in association with the advertisement.

In event D, a web page depicting a resulting “campaign cloud” is generated and displayed to the advertiser. An example of such a display is shown in FIG. 7. In this display, the size of each key phrase is directly proportional to its score. The phrases with the lowest scores, or which have scores falling below a selected threshold, may be omitted from the cloud. The controls displayed with the campaign cloud enable the advertiser to add and remove key phrases.
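The sizing rule for the campaign cloud can be sketched as follows. The function name cloud_font_sizes, the pixel unit, and the default max_px value are assumptions for illustration; the description states only that size is directly proportional to score and that low-scoring phrases may be omitted.

```python
def cloud_font_sizes(scores, max_px=36, threshold=0.0):
    """Map each phrase to a font size proportional to its score.

    The top-scoring phrase is drawn at max_px; phrases scoring below
    threshold are left out of the cloud.
    """
    top = max(scores.values(), default=0)
    if top <= 0:
        return {}
    return {
        phrase: round(max_px * score / top)
        for phrase, score in scores.items()
        if score >= threshold
    }
```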

While viewing the campaign cloud, the advertiser can “mouse over” a key phrase to view its estimated average CPC and average CPM. These values are preferably calculated based on the publisher's minimum CPC and CPM values, and based further on the key phrase scores. CPM and CPC rates are calculated by simplifying the phrase scores to a scale of 1-10 and multiplying the base cost set by the publisher by the score. For example, a base CPM of $0.50 specified by a publisher equates to a $5.00 CPM for the most popular phrases in its peer group. Market trend data, publisher traffic statistics and category information may be factored into the CPM value as well. Phrases entered by the advertiser will be added to the cloud with their corresponding scores on the publisher site. If no score exists or the phrase does not exist on the publisher site, the phrase is assigned the publisher's minimum cost.
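The rate calculation above can be worked through in a short sketch. The function names and the specific mapping onto the 1-10 scale (rounding each score relative to the top score) are assumptions, since the description does not say how scores are reduced to that scale.

```python
def scale_to_ten(scores):
    """Reduce raw scores to integers from 1 to 10 relative to the top score."""
    top = max(scores.values(), default=0)
    if top <= 0:
        return {phrase: 1 for phrase in scores}
    return {
        phrase: max(1, round(10 * score / top))
        for phrase, score in scores.items()
    }


def rate_for(phrase, scaled_scores, base_cost):
    """Rate = publisher's base cost times the 1-10 score.

    Phrases with no score fall back to the publisher's minimum (base) cost.
    """
    return round(base_cost * scaled_scores.get(phrase, 1), 2)
```

With a base CPM of $0.50, the top-scoring phrase is scaled to 10 and priced at $5.00, matching the example in the text, while an advertiser-entered phrase that has no score on the publisher site is priced at the $0.50 minimum.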

Upon proceeding from the campaign cloud page (event E in FIG. 5), the advertiser is presented with a campaign summary page. An example of such a summary page is shown in FIG. 8. Using this page, the advertiser enters and submits a desired budget. The advertiser then proceeds to a campaign summary purchase form (FIG. 9) and specifies payment information. Upon completion of the payment process, the details of the campaign are stored in an ad campaign database (event F).

The above-described process by which the advertiser interacts with the system to create an ad campaign is summarized in FIG. 10.

IV. Browser Loading of Panel-Enabled Page (FIG. 11)

FIG. 11 illustrates the process that occurs when an end user loads a panel-enabled web page 42 of a publisher site 34. In event 101, the tag included in the page causes the user's browser/computer 36 to send a request to the system 30. In response to this request, the system 30 accesses a database to determine whether the page is already indexed. If the page is already indexed, the bundle server 40 returns a cacheable JavaScript component to the browser (event 102 a). This JavaScript requests the key phrases associated with this page, and causes the browser 36 to transform any occurrences of these key phrases into special links.

If the page is not indexed, the system invokes the indexing engine 52, which generates a dataset of key phrases and associated scores. This dataset is stored in a database 60 (event 102 b) in association with the URL of the panel-enabled page, and is maintained therein for a selected period of time (e.g., 1 day or 1 week). Once the indexing is completed in this scenario, the process ends (i.e., no panel is generated).
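The indexed/not-indexed branch can be sketched with a small in-memory store that expires entries after a retention period. The class IndexStore, the function handle_page_request, and the injected clock and index_page callables are hypothetical; the actual database 60 and indexing engine 52 are not specified at this level of detail.

```python
import time


class IndexStore:
    """In-memory stand-in for the database, keyed by page URL."""

    def __init__(self, ttl_seconds=86400, clock=time.time):
        self._ttl = ttl_seconds  # retention period, e.g., 1 day
        self._clock = clock
        self._entries = {}  # url -> (stored_at, {phrase: score})

    def get(self, url):
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, scores = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[url]  # entry has outlived its retention period
            return None
        return scores

    def put(self, url, scores):
        self._entries[url] = (self._clock(), dict(scores))


def handle_page_request(url, store, index_page):
    """Return stored phrase scores, or index the page and return None.

    Returning None models the case in which the page is indexed on this
    load and no panel is generated.
    """
    scores = store.get(url)
    if scores is None:
        store.put(url, index_page(url))
        return None
    return scores
```

In this sketch the first load of a page only triggers indexing, and the key phrases become available to the link-transforming script on a later load, until the stored dataset expires.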

Returning to the scenario in which the page is already indexed, if a user clicks on one of the special links/key phrases 66 (one shown in FIG. 11), the previously loaded JavaScript causes the browser 36 to send a bundle request to the bundle server 40 (event 3 a). (This request may alternatively be made in advance of user selection of a special link, and maintained hidden within the web page unless/until the user selects a special link.) This request identifies the URL of the web page 42 being viewed, and may also identify the particular key phrase 66 selected. The bundle server 40 responds to this request by identifying the key phrases associated with the publisher web page, identifying ads associated with these key phrases (excluding any that do not meet criteria pre-specified by the publisher), and by identifying the most popular social media items that match these keyword phrases. The bundle server then assembles the ads, key phrases and social media items (or thumbnails thereof) into a bundle, and returns the bundle to the web browser 36 (event 104). A separate bundle request may alternatively be generated for each key phrase/link 66 selected by the user.
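The bundle assembly performed by the bundle server can be sketched as below. The function name assemble_bundle, the dictionary shapes used for ads and media items, and the publisher_allows callback are assumptions introduced for illustration.

```python
def assemble_bundle(page_phrases, ads, media_items, publisher_allows, max_media=3):
    """Group ads and the most-viewed media items under each key phrase.

    ads:          list of dicts with a "phrases" collection
    media_items:  list of dicts with a "tags" collection and a "views" count
    publisher_allows: callable returning True if an ad meets the
                      publisher's pre-specified criteria
    """
    bundle = {}
    for phrase in page_phrases:
        phrase_ads = [
            ad for ad in ads
            if phrase in ad["phrases"] and publisher_allows(ad)
        ]
        matching = [m for m in media_items if phrase in m["tags"]]
        most_viewed = sorted(matching, key=lambda m: m["views"], reverse=True)
        bundle[phrase] = {"ads": phrase_ads, "media": most_viewed[:max_media]}
    return bundle
```

Keying the bundle by phrase lets a single response serve every special link on the page, which corresponds to the variant in which the request is made before the user selects a link.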

The JavaScript displays the panel on an independent graphic layer of the web page 42, and loads the panel 44 with the bundle content associated with the key phrase selected by the user. If the user thereafter selects a different key phrase/special link on the page, the panel is reloaded with the bundle content associated with that key phrase. The JavaScript may also be configured to draw ads and related content within other ad units.

The foregoing description of specific embodiments does not limit the invention in any way. Other embodiments and applications that are apparent to those of ordinary skill in the art, including embodiments that do not provide all of the features and advantages set forth herein, are also within the scope of this invention. The scope of the present invention is intended to be defined only by reference to the following claims. Applicants also reserve the right to pursue additional claims that are supported by the present disclosure, including claims of broader scope.

What is claimed is:
 1. A memory device having instructions stored thereon that, in response to execution by a processing device, cause the processing device to perform operations to: input content of a target digital page into a first parser resource to recognize any one or more first links in the content of the target digital page; input information about the target digital page into a second resource to check for one or more second links; identify a peer group of digital pages based on link(s) recognized or obtained by the first parser resource or the second resource, wherein identify the peer group of digital pages comprises identify an outbound link from the target digital page or an inbound link to the target digital page; rank a set of key phrases based on social media popularity of the key phrases of the set and correlation of the key phrases of the set to content of the digital pages of the peer group; and generate, based on a result of the ranking, an interactive display, wherein one or more key phrases of the set are selectable in the interactive display to create an advertisement using user-selected ones of the one or more key phrases.
 2. The memory device of claim 1, wherein the peer group includes the target digital page.